Attribute Access and Descriptors
================================

We've covered the syntax for accessing attributes, and we've covered how to
create new classes in Python. We talked about one special method --
``__init__``. We're going to cover several special methods that all have to do
with attribute access.

The Four Things You Do With Attributes
---------------------------------------

It's important to remember the four things you can do with attributes. These
are the same as the things you can do with any namespace, since attributes
form a namespace for objects.

1. Declare a new attribute (and give it an initial value.)
2. Access the attribute, retrieving its value.
3. Assign to the attribute, giving it a new value.
4. Delete the attribute, removing it from the namespace.

As with variables, you declare an attribute by assigning to it for the first
time, so we will treat these two cases as just "assignment".

When we go about messing with how Python handles attributes, we need to think
about what it is we are trying to accomplish and why. Here are some common
scenarios I encounter.


Default Attributes
""""""""""""""""""

Sometimes I want certain default attributes to exist whether or not I assign
to them in ``__init__``. While I can use class attributes to handle this, I
can also customize the attribute access pattern.

Calculated Attributes
"""""""""""""""""""""

Some attributes may be calculated from the object. For instance, in the case
of complex numbers, the magnitude of the number is calculated as the square
root of the sum of the squares of the real and imaginary parts. I don't want
to calculate this for every complex number, but I do want it to appear as an
attribute. I can modify how the attribute is accessed such that instead of
looking in the ``__dict__`` I can calculate it on the fly.

Note that I don't recommend this sort of behavior. It is surprising to see
that an attribute access has caused a function to be called, and it's weird to
get exceptions from what looks like a simple lookup.
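
For reference, here is a minimal sketch of a calculated attribute. The
``Complex`` class and its attribute names are made up for illustration, and it
relies on ``property``, one of several ways to get this effect.

.. code:: python

  import math

  class Complex:
      def __init__(self, real, imag):
          self.real = real
          self.imag = imag

      @property
      def magnitude(self):
          # Computed on every access; nothing is stored in __dict__.
          return math.hypot(self.real, self.imag)

  c = Complex(3, 4)
  c.magnitude # -> 5.0
  'magnitude' in c.__dict__ # -> False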

Cached Attribute Values
"""""""""""""""""""""""

Sometimes we want to cache values after we have accessed them, and we can
store them as an attribute. I'll cover how to do this in different ways, and
give my recommendations.
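
One possible approach, assuming Python 3.8 or later, is
``functools.cached_property`` from the standard library. The ``Report`` class
and its ``calls`` counter below are hypothetical and exist only to show that
the calculation runs once.

.. code:: python

  import functools

  class Report:
      def __init__(self, numbers):
          self.numbers = numbers
          self.calls = 0

      @functools.cached_property
      def total(self):
          # Runs once per instance; the result is then stored in __dict__.
          self.calls += 1
          return sum(self.numbers)

  r = Report([1, 2, 3])
  r.total # -> 6
  r.total # -> 6, served from r.__dict__ this time
  r.calls # -> 1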

Indefinite Attributes
"""""""""""""""""""""

Sometimes you don't know what attributes your object will have, or there is
virtually an unlimited number of possibilities. In this case, you simply can't
assign all the possible attributes, so you need to calculate them instead.

Remembering Attribute Manipulation
""""""""""""""""""""""""""""""""""

Another special case arises in things like SQLAlchemy. SQLAlchemy allows you
to create objects that represent rows in database tables. Attributes are the
columns of that row. Assigning to an object's attribute signals your intention
to update a column in that row. However, SQLAlchemy is written such that you
can accumulate all the changes you'd like to make in a transaction before
flushing them out to the database.
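
To make the idea concrete, here is a toy sketch of change tracking. This is
*not* how SQLAlchemy is implemented; the ``Row`` class, its ``_dirty``
dictionary and its ``flush`` method are all invented for illustration.

.. code:: python

  class Row:
      def __init__(self, **columns):
          # Set up internal state without going through our own __setattr__.
          object.__setattr__(self, '_dirty', {})
          self.__dict__.update(columns)

      def __setattr__(self, name, value):
          # Remember the pending change, then store the value as usual.
          self._dirty[name] = value
          object.__setattr__(self, name, value)

      def flush(self):
          # A real library would talk to the database here.
          changes = dict(self._dirty)
          self._dirty.clear()
          return changes

  row = Row(name='Ada', age=36)
  row.age = 37
  row.flush() # -> {'age': 37}
  row.flush() # -> {}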

Limiting Attribute Values
"""""""""""""""""""""""""

We might want to limit what values we can assign to certain attributes.
Usually we're interested in only values of a specific type but sometimes we
might want to enforce limits to the values. For instance, it makes no sense to
have circles with negative radii, so we might want to limit the values of
``radius`` to numbers greater than or equal to 0.
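
A minimal sketch of the circle example, using ``property`` (covered later in
this section). The ``Circle`` class and the ``_radius`` storage name are
assumptions made for the example.

.. code:: python

  class Circle:
      def __init__(self, radius):
          self.radius = radius # goes through the setter below

      @property
      def radius(self):
          return self._radius

      @radius.setter
      def radius(self, value):
          if value < 0:
              raise ValueError("radius must be >= 0")
          self._radius = value

  c = Circle(2)
  c.radius = 3 # fine
  try:
      c.radius = -1
  except ValueError as e:
      print(e) # radius must be >= 0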

``__getattribute__(self, name)``
--------------------------------

The "granddaddy" of all the attribute access special methods is
``__getattribute__``. The ``object`` base class of all base classes for all
time ever defines its own ``__getattribute__`` that implements all of the
magical powers I describe in this video.
 
I have never written this method for any class that I have ever defined in all
my years of Python programming. So I'll tell you what it does and the one or
two cases where you might be tempted to use it, and why you should use
something else instead.

This method is called for *every* attribute access. If the attribute is in the
namespace, or a descriptor, or handled by the attribute access methods, this
is still called. (Note, by *every* I really mean *everything you can think of
right now.* There are special attributes that will not be accessed through
this method, for everyone's sanity.)

If you were to write this method, you would be responsible for everything the
default does: looking in the namespace, invoking descriptors, and so on.
Either you write that code yourself, or you delegate to
``object.__getattribute__(self, name)``. Alternatively, you can raise an
``AttributeError``, and Python will then fall back to ``__getattr__`` if the
class defines one.

Why would you do this? Well, you might want to have very special rules for
when to create a new attribute, what values to assign to attributes, or how to
calculate an attribute on the fly, or what to do when you delete an attribute.
However, each of these cases is handled by the methods below exactly the way
you imagine it should be.

The only case where I think this might be useful is if you want some sort of
super-descriptor scenario, but even then, I'd encourage you to find a way to
make descriptors work. When you include metaclasses in your arsenal, you'll
find that mucking with this is truly unnecessary.
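
If you ever do need it, the safe pattern is to delegate to the default
implementation. Here is a minimal sketch; the ``Traced`` class and its ``seen``
list are hypothetical.

.. code:: python

  class Traced:
      seen = [] # every name looked up through an instance

      def __init__(self):
          self.x = 1

      def __getattribute__(self, name):
          # Called for every lookup on an instance. Avoid self.<anything>
          # in here, or the lookup recurses forever.
          Traced.seen.append(name)
          return super().__getattribute__(name)

      def __getattr__(self, name):
          # Reached only when __getattribute__ raises AttributeError.
          return f"fallback for {name}"

  t = Traced()
  t.x # -> 1
  t.missing # -> 'fallback for missing'
  Traced.seen # -> ['x', 'missing']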

``__dir__(self)``
-----------------

This returns the list of attributes when the object is passed as an argument
to ``dir()``. I never override this, and I don't think you should either.

``__getattr__(self, name)``
---------------------------

This returns a value given a name, but it is only called when the normal
lookup fails, that is, when the name is not found in the instance's
``__dict__`` or on the class.

The neat thing about this is that you don't even need the name passed in to be
stored in the ``__dict__`` of the object.  You can just make things up as
needed. Here's an example:

.. code:: python

  class Attributor:
      def __getattr__(self, name):
          return name

  a = Attributor()
  a.foo # -> 'foo'

  a.foo = 6
  a.foo # -> 6, because 'foo' is in self.__dict__

It's been a long time since I've found a good use for this. Descriptors just
do this better.

The only case where this might be beneficial is in the indefinite number of
attributes scenario.
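
As a sketch of that scenario, a wrapper can forward any attribute it doesn't
have to another object. The ``Proxy`` class and its ``_target`` attribute are
made up for this example.

.. code:: python

  class Proxy:
      def __init__(self, target):
          self._target = target

      def __getattr__(self, name):
          # Reached only when normal lookup on the proxy fails, so
          # _target itself is found in __dict__ without coming here.
          return getattr(self._target, name)

  p = Proxy([3, 1, 2])
  p.count(1) # -> 1
  p.index(2) # -> 2
  hasattr(p, 'nope') # -> False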

``__setattr__(self, name, value)``
----------------------------------

This special method will be called when you try to assign to any attribute.
And by "any", I really mean "any".

If you want to proceed with the normal assignment behavior of attributes, you
*must* call ``super().__setattr__(name, value)`` or just
``object.__setattr__(self, name, value)``. You may also store the value
directly in the ``__dict__`` of the object.

In this example, I have a ``Vector3`` class that allows you to set the x, y,
and z components by attribute assignment.

.. code:: python

  class Vector3:
      def __init__(self):
          self.vector = [0, 0, 0]

      def __setattr__(self, name, value):
          if name == 'x': self.vector[0] = value
          elif name == 'y': self.vector[1] = value
          elif name == 'z': self.vector[2] = value
          else: object.__setattr__(self, name, value)

Like ``__getattr__``, it's been a long time since I've seen a use for this.
Descriptors do it better.

The only case where this might be beneficial is in the indefinite number of
attributes scenario.

Remember that *every* attribute assignment will go through this function, so
plan accordingly!

``__delattr__(self, name)``
---------------------------

This is parallel to ``__setattr__`` except for attribute deletion. I don't
think I've ever written this in my entire career.

It may be useful if deleting an attribute is meaningful. I must not have a
good imagination because I can't think of a situation where it would be.

Attribute Dictionary
--------------------

Given ``__getattr__``, ``__setattr__``, and ``__delattr__``, you might get the
idea that you can write an Attribute Dictionary, a class that behaves like a
dictionary but also allows you to access the values via the attribute syntax.
This has been done multiple times.

I don't encourage it, however, for a number of reasons, not the least
of which is you can have keys in the dictionary that are not valid attribute
names, and you might have collisions between actual attributes and dictionary
elements.
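
A rough sketch shows both the idea and the problems. The ``AttrDict`` class
below is a hypothetical minimal version, not any particular library's
implementation.

.. code:: python

  class AttrDict(dict):
      def __getattr__(self, name):
          try:
              return self[name]
          except KeyError:
              raise AttributeError(name) from None

      def __setattr__(self, name, value):
          self[name] = value

      def __delattr__(self, name):
          try:
              del self[name]
          except KeyError:
              raise AttributeError(name) from None

  d = AttrDict()
  d.color = 'red'
  d['color'] # -> 'red'

  # Problem 1: keys that are not valid attribute names.
  d['first name'] = 'Ada' # only reachable as d['first name']

  # Problem 2: collisions with real attributes.
  d['keys'] = 'oops'
  d.keys # -> the dict.keys method, not 'oops'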

Descriptors
-----------

At this point, I get to introduce one of the truly unique and scary ideas in
Python: Descriptors.

The confusing part about descriptors is understanding exactly *when* the
special methods get called.

In short, if the descriptor is stored as an attribute of an *instance* (not
the *class*), then it is not treated as a descriptor, and the attribute access
does nothing special.

The special methods only come into play when the descriptor is an attribute of
a *class*. Accessing it through an instance of that class, or directly
*through* the class itself, invokes the descriptor special methods. We call
classes with attributes that are descriptors *owner classes*.

Let me try to simplify this all.  Suppose we have the following:

- A descriptor called ``desc`` that defines the special methods ``__get__``,
  ``__set__``, or ``__delete__``.
- A class called ``Owner`` that has an attribute called ``desc`` that is a
  descriptor.
- An instance of class ``Owner`` called ``inst`` that does not have an
  attribute called ``desc`` assigned at the instance level. (Meaning, you never
  executed ``inst.__dict__['desc'] = ...``. If you tried ``inst.desc = ...``
  that would invoke the special methods.)

These are the only four cases where the descriptor methods get called:

- If you call the method directly: ``desc.__get__(...)``,
  ``desc.__set__(...)`` or ``desc.__delete__(...)``.
- When you access, assign, or delete ``desc`` through ``inst``: ``inst.desc``, ``inst.desc = x``,
  ``del inst.desc``.
- When you access ``desc`` through the class ``Owner``: ``Owner.desc``.
  Deleting or assigning ``desc`` through the class ``Owner`` does not invoke
  the special methods.
- Something to do with ``super`` that we'll cover in inheritance and really
  not that important because it hardly ever becomes an issue.

If you're thoroughly confused, don't feel bad. This *is* confusing. The way I
remember it is as follows:

- Descriptors that are just variables are *not* special.
- Instances with descriptor attributes defined at that level are *not*
  special. It is just like descriptors that are variables.
- Classes with descriptor attributes *are* special, both for the class and
  instances of that class.

Example
"""""""

Here's some code that will help clarify things.

.. code:: python

  class Desc:
      def __get__(self, instance, owner):
          print(f"__get__({self}, {instance}, {owner})")
      def __set__(self, instance, value):
          print(f"__set__({self}, {instance}, {value})")
      def __delete__(self, instance):
          print(f"__delete__({self}, {instance})")

  desc = Desc() # desc is a descriptor

  # desc behaves like a normal variable. There is no special behavior.
  desc

  # Accessing the methods directly through desc, the variable.
  desc.__get__(1,2)
  # __get__(<__main__.Desc object at 0x000001F0C17B8D68>, 1, 2)
  desc.__set__(1,2)
  # __set__(<__main__.Desc object at 0x000001F0C17B8D68>, 1, 2)
  desc.__delete__(1)
  # __delete__(<__main__.Desc object at 0x000001F0C17B8D68>, 1)

  # Creating an owner class
  class Owner: pass

  # The descriptor does not need to be assigned in the class body, but it can be.
  Owner.desc = desc

  # Accessing the descriptor as an attribute of the class will invoke __get__.
  # Note the arguments!
  Owner.desc
  # __get__(<__main__.Desc object at 0x000001F0C17B8D68>, None, <class '__main__.Owner'>)

  # Deleting the attribute will just delete the descriptor. No special method.
  del Owner.desc
  Owner.desc
  #Traceback (most recent call last):
  #  File "<pyshell#26>", line 1, in <module>
  #    Owner.desc
  # AttributeError: type object 'Owner' has no attribute 'desc'

  # Restoring the descriptor
  Owner.desc = desc
  Owner.desc
  # __get__(<__main__.Desc object at 0x000001F0C17B8D68>, None, <class '__main__.Owner'>)

  # Assigning will overwrite it, no special method.
  Owner.desc = 2
  Owner.desc
  # 2

  # Restoring the descriptor
  Owner.desc = desc

  # inst is an instance of Owner.
  inst = Owner()

  # Accessing through inst invokes the __get__ method. Note the arguments!
  inst.desc
  # __get__(<__main__.Desc object at 0x000001F0C17B8D68>, <__main__.Owner object at 0x000001F0C1831C88>, <class '__main__.Owner'>)

  # Assigning through inst invokes the __set__ method. Note the arguments!
  inst.desc = 5
  # __set__(<__main__.Desc object at 0x000001F0C17B8D68>, <__main__.Owner object at 0x000001F0C1831C88>, 5)

  # Deleting through inst invokes the __delete__ method. Note the arguments!
  del inst.desc
  # __delete__(<__main__.Desc object at 0x000001F0C17B8D68>, <__main__.Owner object at 0x000001F0C1831C88>)

  # Remove the descriptor from Owner.
  del Owner.desc

  # There is no desc attribute on inst anymore.
  inst.desc
  #Traceback (most recent call last):
  #  File "<pyshell#37>", line 1, in <module>
  #    inst.desc
  #AttributeError: 'Owner' object has no attribute 'desc'

  # Let's assign the desc to the instance.
  inst.desc = desc

  # Accessing does nothing special.
  inst.desc
  #<__main__.Desc object at 0x000001F0C17B8D68>

  # Assignment does nothing special
  inst.desc = 2
  inst.desc
  #2

  # Resetting
  inst.desc = desc

  # Deleting does nothing special.
  del inst.desc
  inst.desc
  #Traceback (most recent call last):
  #  File "<pyshell#45>", line 1, in <module>
  #    inst.desc
  #AttributeError: 'Owner' object has no attribute 'desc'


Data vs. Non-data Descriptors
"""""""""""""""""""""""""""""

You may hear people talk about "data" or "non-data" descriptors. This is
rather easy to explain:

- Data Descriptors have either or both ``__set__`` and ``__delete__`` defined.
- Non-data descriptors don't have either defined.

Note that you *can* have a descriptor that doesn't have ``__get__`` defined,
but I have never seen a use for this. The net effect of such a descriptor is
that it only intercepts assignment or deletion. If you access the attribute,
you get the value stored in the instance's ``__dict__`` under that name, or
the descriptor object itself if there is none.
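
The practical difference between the two kinds is lookup priority when the
instance's ``__dict__`` holds an entry with the same name. Here is a small
sketch, with hypothetical ``NonData`` and ``Data`` classes:

.. code:: python

  class NonData:
      def __get__(self, instance, owner=None):
          return 'from descriptor'

  class Data(NonData):
      def __set__(self, instance, value):
          raise AttributeError("read-only")

  class Owner:
      nd = NonData()
      d = Data()

  inst = Owner()
  # Write straight into the instance namespace, bypassing __set__.
  inst.__dict__['nd'] = 'from instance'
  inst.__dict__['d'] = 'from instance'

  inst.nd # -> 'from instance': __dict__ beats a non-data descriptor
  inst.d  # -> 'from descriptor': a data descriptor beats __dict__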

``__get__(self, instance, owner=None)``
"""""""""""""""""""""""""""""""""""""""

This special method is called whenever the descriptor is accessed in one of
the four ways above. Note that ``instance`` may be ``None``.

- If the descriptor was accessed directly through the class, i.e.,
  ``Owner.desc``, then ``instance`` is ``None`` and ``owner`` is ``Owner``.
- If the descriptor was accessed through an instance of the class, i.e.,
  ``inst.desc``, then ``instance`` is ``inst`` and ``owner`` is ``Owner``.

You should return the value that should be the value of the attribute for this
descriptor. Note that you are *not* given the name that was used to find the
attribute, so unless you stored it previously, you won't have that available.
(We'll talk about ``__set_name__`` later.)

This is actually a pretty big problem to solve and it causes a bit of a
headache. See, each instance of a class with a descriptor for an attribute is
using the *same* descriptor for attribute access. This means you need to know
in this code *which* attributes to look at in the instance to calculate the
value of this attribute access lookup.

However, with a bit of imagination, you can come up with good solutions. And
Python 3.6 gave us ``__set_name__``, which will help.

This function should either return a value (remember that no return statement
means ``return None``) or raise ``AttributeError`` if the attribute shouldn't
exist.

Typically, especially in the case of caching attribute values, you'll want to
store the value you calculated in the instance of the class so that you don't
have to recalculate it. This means that I typically see the following pattern
for this method:

.. code:: python

  class MyDescriptor:
      def __get__(self, instance, owner=None):
          if instance is None: # We're being accessed through the class
              return self

          # We're being accessed through an instance.
          # This assumes the instance has a _cached dict.
          try:
              return instance._cached[self.name]
          except KeyError:
              pass

          value = ...
          instance._cached[self.name] = value
          return value

In order to use the descriptor, we need to do something like this:

.. code:: python

  class MyClass:

      foo = MyDescriptor()
      foo.name = 'foo'

Using the descriptor looks like this:

.. code:: python

  MyClass.foo # -> __get__(self, None, MyClass)
  a = MyClass()
  a.foo # -> __get__(self, a, MyClass)

When should we use descriptors? Pretty much anytime we want to override the
default behavior of attribute access. There may already be a built-in
descriptor that does what you want (we'll look at ``property``,
``classmethod`` and ``staticmethod`` in this video), so I'd typically use one
of those, especially ``property``. Rarely do I ever write an entirely new
descriptor.


``__set__(self, instance, value)``
""""""""""""""""""""""""""""""""""

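This is called when you assign to the attribute through an instance of the
owner class, or when you call the method directly, as I mentioned earlier.
Assigning through the class itself just replaces the descriptor.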

Again, we run into the same problem we have with ``__get__`` and names of
attributes. Unless you've recorded the name of the attribute, you won't know
what name the attribute was accessed through.

Typically, we use ``__set__`` to modify the value before storing it,
especially if we want to make sure that the value is of an acceptable type or
value. However, we might also want to store the value in something other than
the attribute's namespace under the same name.

Here's a typical pattern I might see for a ``__set__`` method:

.. code:: python

  class Desc:
      def __set__(self, instance, value):
          instance.__dict__['_'+self.name] = int(value)

And the ways it might get invoked:

.. code:: python

  class Owner:

      foo = Desc()
      foo.name = 'foo'

  instance = Owner()
  instance.foo = "5" # instance.__dict__['_foo'] <- 5
  instance.foo = 7.0 # instance.__dict__['_foo'] <- 7

As with ``__get__``, I don't tend to write my own descriptors. ``property``
actually does everything I need.

``__delete__(self, instance)``
""""""""""""""""""""""""""""""

This method, not to be confused with ``__del__`` (which is invoked when the
object is about to be destroyed), is invoked when the attribute is deleted
through an instance of the owner class, or when you call the method directly.

I don't have a whole lot to say about this, other than you probably want to
use ``property`` instead.

``__set_name__(self, owner, name)``
"""""""""""""""""""""""""""""""""""

This isn't really a descriptor method, but it goes along closely with them.
When Python creates the class (in ``type(name, bases, dict)``), it searches
the ``dict``, the namespace of the class, looking for values that have this
method. For each one it finds, it calls ``value.__set_name__(new_class,
name)``, where ``name`` is the name of the value in that namespace. It's quite
convenient, especially when you think about how hard it is to get the name of
the attribute.

If we didn't have this, we would have to explicitly set the name, as I did in
the examples above. We could use decorators, or parameters to new instances of
the descriptor, or whatnot. ``__set_name__`` greatly simplifies that process.
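
Here is a minimal sketch of a descriptor that learns its own name this way.
The ``Integer`` and ``Point`` classes are hypothetical, and storing the value
in the instance's ``__dict__`` under the same name is just one possible choice.

.. code:: python

  class Integer:
      def __set_name__(self, owner, name):
          # Called once, when the owner class is created.
          self.name = name

      def __get__(self, instance, owner=None):
          if instance is None:
              return self
          return instance.__dict__[self.name]

      def __set__(self, instance, value):
          instance.__dict__[self.name] = int(value)

  class Point:
      x = Integer()
      y = Integer()

  p = Point()
  p.x = "5"
  p.y = 7.9
  p.x, p.y # -> (5, 7)
  Point.x.name # -> 'x'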

Built-In Descriptors
--------------------

There are three built-in descriptors that I will mention here. All three are
decorators, and serve a very special purpose.

``@staticmethod``
"""""""""""""""""

Sometimes you want a method on a class that doesn't rely at all on the class
attributes or instance attributes. In this case, ``staticmethod`` provides a
convenient decorator:

.. code:: python

  class MyClass:

      @staticmethod
      def sum(a, b): return a+b

  MyClass.sum(1,2) # 3
  a = MyClass()
  a.sum(1,5) # 6

``@classmethod``
""""""""""""""""

Parallel to ``@staticmethod`` is this decorator, which ensures that the first
parameter is always the class, even when it is invoked from an instance of the
class.

.. code:: python

  class MyClass:

      @classmethod
      def what_class_am_i(cls):
          return cls

  MyClass.what_class_am_i()
  a = MyClass()
  a.what_class_am_i()

This is convenient because without the decorator, a method called through the
class isn't bound to anything, so you would have to pass in *something* as the
first parameter yourself.

I use this especially when I am creating a singleton-style class. Why not just
use the class as the singleton? You'll see things like this a lot with
libraries like Flask and CherryPy.
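
A small sketch of the class-as-singleton idea follows. The ``Config`` class
and its methods are made up for illustration and are not taken from Flask or
CherryPy.

.. code:: python

  class Config:
      settings = {} # shared state lives on the class itself

      @classmethod
      def set(cls, key, value):
          cls.settings[key] = value

      @classmethod
      def get(cls, key):
          return cls.settings[key]

  # No instance is ever needed.
  Config.set('debug', True)
  Config.get('debug') # -> True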


``property``
""""""""""""

This descriptor is arguably one of the most useful descriptors ever invented.
Indeed, it can be said that descriptors were invented specifically to make
``property`` possible.

99.9% of the time, when you want to modify how attributes are accessed,
assigned, or deleted, ``property`` has you covered.

Let's look at how it might be used:

.. code:: python

  class MyClass:

      @property
      def a(self):
          return self.b*6

      @a.setter
      def a(self, value):
          self.b = value/6

      @a.deleter
      def a(self):
          del self.b

  MyClass.a # the property object itself
  i = MyClass()
  i.a # AttributeError -- no b!
  i.a = 1
  i.b # 0.16666...
  i.a # 1.0
  del i.a
  i.b # AttributeError
  i.a # AttributeError -- no b!

Note that assigning to or deleting  ``MyClass.a`` will obliterate the
descriptor, removing it completely.

Keep in mind that creating setters or deleters for the property is entirely
optional.

It is really fun to figure out how ``property`` is implemented. Keep in mind
the following:

- After the setter and deleter decorators are applied, Python will assign the
  results to the ``a`` variable in the class namespace. How does ``property``
  not overwrite itself?
- How does the ``property`` know to do the right thing for ``i.a = ...``
  and ``del i.a``?
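
One way to explore these questions is a stripped-down imitation written in
pure Python. The ``MyProperty`` class below is only a sketch under that
assumption. The real ``property`` is implemented in C and does more (for
example, it handles docstrings).

.. code:: python

  class MyProperty:
      def __init__(self, fget=None, fset=None, fdel=None):
          self.fget, self.fset, self.fdel = fget, fset, fdel

      def __get__(self, instance, owner=None):
          if instance is None:
              return self
          if self.fget is None:
              raise AttributeError("unreadable attribute")
          return self.fget(instance)

      def __set__(self, instance, value):
          if self.fset is None:
              raise AttributeError("can't set attribute")
          self.fset(instance, value)

      def __delete__(self, instance):
          if self.fdel is None:
              raise AttributeError("can't delete attribute")
          self.fdel(instance)

      def setter(self, fset):
          # Returns a *new* descriptor that keeps the existing getter.
          return type(self)(self.fget, fset, self.fdel)

      def deleter(self, fdel):
          return type(self)(self.fget, self.fset, fdel)

  class Temperature:
      @MyProperty
      def celsius(self):
          return self._c

      @celsius.setter
      def celsius(self, value):
          self._c = float(value)

  t = Temperature()
  t.celsius = "21"
  t.celsius # -> 21.0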


``__slots__``
-------------

Before I let you go, I want to mention ``__slots__``. In the class suite, if you
define a variable called ``__slots__``, then Python won't create a
``__dict__`` for instances of the class. Instead, it reserves a spot in each
instance for each of the attribute names listed in ``__slots__``.

If you try to assign an attribute that isn't listed, an ``AttributeError``
will be raised.

``__slots__`` creates a special descriptor on the class for each attribute.
That means you can't use class attributes as defaults for those names: the
names would collide, and Python raises a ``ValueError`` when it creates the
class.

There are some more caveats and warnings and I encourage you to read the
Python documentation on it if you think this might be useful.

Speaking of which, when is this useful? If you're creating tons and tons of
instances and you are worried about the overhead of each object having its own
``__dict__``, this is a way you can reduce the creation time and memory
overhead of creating new instances of the class. That's about all it is useful
for.
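
A quick sketch of the behavior described above, using a hypothetical ``Point``
class:

.. code:: python

  class Point:
      __slots__ = ('x', 'y')

      def __init__(self, x, y):
          self.x = x
          self.y = y

  p = Point(1, 2)
  hasattr(p, '__dict__') # -> False
  Point.x # the descriptor that manages the 'x' slot

  try:
      p.z = 3
  except AttributeError:
      print("no slot named z")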